Subset Seed Automaton
Authors
Abstract
We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained by applying the Aho-Corasick construction. We study properties of this automaton and present an efficient implementation of the automaton construction. We also present some experimental results and show that this automaton can be successfully applied to more general situations.
Similar resources
[27 Jan 2006] A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)
We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which ...
[inria-00001164, version 1 - 24 Mar 2006] A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)
We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which ...
A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract)
We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a set of target alignments, an associated probability distribution, and a seed model – that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which we pr...
A Unifying Framework for Seed Sensitivity and Its Application to Subset Seeds
We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem--a set of target alignments, an associated probability distribution, and a seed model--that are specified by distinct finite automata. The approach is then applied to a new concept of subset seeds for which we...
[inria-00170414, version 1 - 7 Sep 2007] Subset seed automaton
We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained by applying the Aho-Corasick construction. We study properties of this automaton and present an efficient implementation of the automaton construction. We also present some experimental results and...